We consider optimizations that are required for efficient execution of code segments that consist of loops over distributed data structures. The PARTI execution-time primitives are designed to carry out these optimizations and can be used to implement a wide range of scientific algorithms on distributed memory machines. These primitives allow the user to control array mappings in a way that gives the appearance of shared memory. Computations can be based on a global index set. Primitives are used to carry out gather and scatter operations on distributed arrays. Communication patterns are derived at runtime, and the appropriate send and receive messages are automatically generated.
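As a rough illustration of how such primitives operate, the C sketch below shows the inspector/executor pattern they embody: an inspector pass scans the indirection array once to build a communication schedule, and an executor pass performs the gather. All names here (Schedule, build_schedule, do_gather) are illustrative assumptions, not the actual PARTI interface, and the "remote" data is simulated inside a single process rather than fetched over a network.

    /* Hypothetical sketch of the inspector/executor pattern behind
     * PARTI-style gather/scatter; names are illustrative, not the
     * real PARTI API.  Remote data is simulated in one process.   */
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
        int  n_offproc;    /* number of off-processor references */
        int *offproc_idx;  /* global indices we must fetch       */
    } Schedule;

    /* Inspector: scan the indirection array ia once and record which
     * references fall outside this processor's local block [lo, hi).
     * On a real machine these lists would be exchanged so that each
     * sender knows which elements to ship.                           */
    Schedule build_schedule(const int *ia, int n, int lo, int hi) {
        Schedule s = { 0, malloc(n * sizeof(int)) };
        for (int i = 0; i < n; i++)
            if (ia[i] < lo || ia[i] >= hi)
                s.offproc_idx[s.n_offproc++] = ia[i];
        return s;
    }

    /* Executor: copy the needed remote values into a local buffer.
     * In PARTI this step would issue the precomputed sends/receives. */
    void do_gather(double *buf, const double *global, const Schedule *s) {
        for (int i = 0; i < s->n_offproc; i++)
            buf[i] = global[s->offproc_idx[i]];
    }

    int main(void) {
        double x[8] = {0, 1, 2, 3, 4, 5, 6, 7}; /* conceptually distributed */
        int ia[4]   = {1, 6, 2, 7};             /* indirection array        */
        Schedule s  = build_schedule(ia, 4, 0, 4); /* this proc owns [0,4)  */
        double buf[4];
        do_gather(buf, x, &s);
        for (int i = 0; i < s.n_offproc; i++)
            printf("fetched x[%d] = %g\n", s.offproc_idx[i], buf[i]);
        free(s.offproc_idx);
        return 0;
    }

The key point the sketch captures is the two-phase split: the (possibly expensive) schedule is built once and can then be reused across every iteration of a loop whose access pattern does not change.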
Distributed Shared Memory (DSM) systems have been proposed as a way of combining the programmability...
While parallel programming is needed to solve large-scale scientific applications, it is more diffic...
In prior work, we have proposed techniques to extend the ease of shared-memory parallel programming ...
Optimizations are considered that are required for efficient execution of code segments that consist...
This paper describes a number of optimizations that can be used to support the efficient execution o...
Sparse matrix-vector (SpMV) multiplication is a widely used kernel in scientific applications. In th...
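Since the abstract is truncated, the C kernel below gives the standard compressed sparse row (CSR) formulation of SpMV that work in this area typically starts from; the choice of CSR and all identifiers are assumptions for illustration, not details taken from the paper.

    /* Generic CSR sparse matrix-vector product y = A*x.  The inner
     * access x[col[k]] is the irregular, indirection-driven read that
     * makes SpMV hard to optimize.                                    */
    #include <stdio.h>

    void spmv_csr(int n, const int *rowptr, const int *col,
                  const double *val, const double *x, double *y) {
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int k = rowptr[i]; k < rowptr[i + 1]; k++)
                sum += val[k] * x[col[k]];
            y[i] = sum;
        }
    }

    int main(void) {
        /* 3x3 matrix [[2,0,1],[0,3,0],[4,0,5]] in CSR form */
        int    rowptr[] = {0, 2, 3, 5};
        int    col[]    = {0, 2, 1, 0, 2};
        double val[]    = {2, 1, 3, 4, 5};
        double x[] = {1, 1, 1}, y[3];
        spmv_csr(3, rowptr, col, val, x, y);
        for (int i = 0; i < 3; i++)
            printf("y[%d] = %g\n", i, y[i]);   /* expect 3, 3, 9 */
        return 0;
    }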
Sparse system solvers and general purpose codes for solving partial differential equations are examp...
We present the design and implementation of a parallel algorithm for computing Gröbner bases on dist...
Increased programmability for concurrent applications in distributed systems requires automatic supp...
In adaptive irregular problems the data arrays are accessed via indirection arrays, and data access ...
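To make that access pattern concrete, the following C sketch shows the kind of irregular loop the abstract describes: the data arrays are read and updated only through indirection arrays (here an assumed unstructured-mesh edge list, not an example from the paper), so the communication pattern cannot be known until runtime, and it changes whenever the mesh adapts.

    /* Illustrative irregular kernel: nodal arrays x and y are
     * accessed through the indirection arrays edge1 and edge2.  */
    #include <stdio.h>

    #define NEDGE 4
    #define NNODE 5

    int main(void) {
        int edge1[NEDGE] = {0, 1, 2, 3};   /* indirection arrays */
        int edge2[NEDGE] = {1, 2, 3, 4};
        double x[NNODE] = {1, 2, 3, 4, 5};
        double y[NNODE] = {0};

        /* Sweep over edges; both the reads and the accumulations go
         * through the indirection arrays, so the pattern is known
         * only at runtime.                                          */
        for (int e = 0; e < NEDGE; e++) {
            double flux = x[edge1[e]] - x[edge2[e]];
            y[edge1[e]] += flux;
            y[edge2[e]] -= flux;
        }
        for (int i = 0; i < NNODE; i++)
            printf("y[%d] = %g\n", i, y[i]);
        return 0;
    }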
On shared memory parallel computers (SMPCs) it is natural to focus on decomposing the computation (...
In recent years, distributed memory parallel machines have been widely recognized as the most likely...
In this article we investigate the trade-off between time and space efficiency in scheduling and execut...
The parallelization of complex, irregular scientific applications with various computational require...
OpenMP has emerged as the de facto standard for writing parallel programs on shared address space pl...
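For readers unfamiliar with the model, a minimal OpenMP example in C follows; it is a generic illustration of shared-address-space loop parallelism, not code from the paper.

    /* Minimal OpenMP loop parallelism: the pragma splits the
     * iterations across the threads of a shared address space.
     * Build with an OpenMP-enabled compiler, e.g. cc -fopenmp.  */
    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        const int n = 1000000;
        double sum = 0.0;

        /* reduction(+:sum) gives each thread a private partial sum
         * that OpenMP combines when the loop completes.            */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < n; i++)
            sum += 1.0 / (i + 1.0);

        printf("harmonic sum = %f (max threads: %d)\n",
               sum, omp_get_max_threads());
        return 0;
    }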